A Brief Overview on Runtime-Aware Architectures

Authors

  • Marc Casas
  • Miquel Moretó
  • Eduard Ayguadé
  • Jesus Labarta
  • Mateo Valero
Abstract

When uniprocessors were the norm, Instruction Level Parallelism (ILP) and Data Level Parallelism (DLP) were widely exploited to increase the number of instructions executed per cycle. The main hardware designs used to exploit ILP were superscalar and Very Long Instruction Word (VLIW) processors. The VLIW approach relies on statically determining the dependencies between instructions and scheduling them at compile time. However, since it is not possible in general to obtain optimal schedules at compile time, VLIW does not fully exploit the potential ILP that many workloads have. Superscalar designs try to overcome increasing memory latencies, the so-called Memory Wall [8], by using Out of Order (OoO) and speculative execution [3]. Additionally, techniques such as prefetching, which starts fetching data from memory ahead of time; deep memory hierarchies, which exploit the locality that many programs have; and large reorder buffers, which increase the number of speculative instructions exposed to the hardware, have also been used to enhance superscalar processor performance. DLP is typically expressed explicitly at the software layer and consists of a parallel operation on multiple data performed by multiple independent instructions or by multiple independent threads. In uniprocessors, the Instruction Set Architecture (ISA) was in charge of decoupling the application, written in a high-level programming language, from the hardware. In this context, architectural improvements were applied at the pipeline level without changing the ISA.
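As a rough illustration of the static scheduling that the abstract attributes to VLIW, the sketch below greedily packs instructions into fixed-width bundles from a known dependency graph. It is not taken from the paper: the function name `schedule_bundles`, the toy `program`, the unit-latency assumption, and the greedy program-order policy are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set


def schedule_bundles(deps: Dict[str, Set[str]], width: int) -> List[List[str]]:
    """Greedily pack instructions into bundles of at most `width` slots.

    Assumption for this sketch: every instruction has unit latency, so an
    instruction may issue in any bundle after the ones holding all of the
    instructions it depends on.
    """
    done: Set[str] = set()
    remaining = list(deps)  # dict order stands in for program order
    bundles: List[List[str]] = []
    while remaining:
        # Instructions whose dependencies have all issued in earlier bundles.
        ready = [i for i in remaining if deps[i] <= done][:width]
        if not ready:
            raise ValueError("cyclic dependency between instructions")
        bundles.append(ready)
        done.update(ready)
        remaining = [i for i in remaining if i not in ready]
    return bundles


# Hypothetical five-instruction program: name -> names it depends on.
program = {
    "a": set(),
    "b": set(),
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"c", "d"},
}

if __name__ == "__main__":
    for cycle, bundle in enumerate(schedule_bundles(program, width=2)):
        print(cycle, bundle)
```

A greedy packer like this one finds a valid schedule but not necessarily an optimal one, and it knows nothing about run-time events such as cache misses, which is in line with the abstract's point that compile-time scheduling cannot in general be optimal.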

Similar resources

Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)

Nowadays, faults and failures are increasing especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip due to the increasing susceptibility and decreasing feature sizes. On the other hand, fault-tolerant routing algorithms have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...

Autotuning of Pattern Runtimes for Accelerated Parallel Systems

Parallel architectures with node-level accelerators promise significant performance improvements over conventional homogeneous systems. To cope with the increased complexity of programming such systems various pattern-based programming libraries have become available. In this paper we present our work on providing autotuning capabilities for two runtime libraries that provide parallel programmi...

A Routing-Aware Simulated Annealing-based Placement Method in Wireless Network on Chips

Wireless network on chip (WiNoC) is one of the promising on-chip interconnection networks for on-chip system architectures. In addition to wired links, these architectures also use wireless links. Using these wireless links makes packets reach destination nodes faster and with less power consumption. These wireless links are provided by wireless interfaces in wireless routers. The WiNoC archite...

Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems

Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multi-core systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate and flexible communication for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...

Characterization of Locality Aware Task Scheduling Mechanism

The architectural features of modern computers highlight the need for parallel programming to achieve sustained performance. This paper deals with task-based programming for modern computers. Because existing task scheduling lacks data locality, communication optimization and task characterization support, we intend to overview the characterization of locality aware task scheduli...

The PEPPHER Approach to Programmability and Performance Portability for Heterogeneous many-core Architectures

The European FP7 project PEPPHER is addressing programmability and performance portability for current and emerging heterogeneous many-core architectures. As its main idea, the project proposes a multi-level parallel execution model comprised of potentially parallelized components existing in variants suitable for different types of cores, memory configurations, input characteristics, optimizat...

Publication date: 2015